In [449]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
In [450]:
import os
In [451]:
os.listdir(r"/Users/mani/Documents/Data analytics projects/Datasets/")
Out[451]:
['other-Lyft_B02510.csv',
 'other-FHV-services_jan-aug-2015.csv',
 'other-Firstclass_B01536.csv',
 'other-Skyline_B00111.csv',
 'uber-raw-data-janjune-15_sample.csv',
 'uber-raw-data-janjune-15.csv',
 'other-American_B01362.csv',
 'uber-raw-data-apr14.csv',
 'Uber-Jan-Feb-FOIL.csv',
 'other-Highclass_B01717.csv',
 'uber-raw-data-aug14.csv',
 'uber-raw-data-sep14.csv',
 'uber-raw-data-jul14.csv',
 'other-Federal_02216.csv',
 'uber-raw-data-jun14.csv',
 'other-Carmel_B00256.csv',
 'other-Diplo_B01196.csv',
 'other-Dial7_B00887.csv',
 'uber-raw-data-may14.csv',
 'other-Prestige_B01338.csv']
In [ ]:
 
In [452]:
uber_15 = pd.read_csv(r"/Users/mani/Documents/Data analytics projects/Datasets/uber-raw-data-janjune-15_sample.csv")
In [453]:
uber_15.shape
Out[453]:
(100000, 4)
In [ ]:
 
In [454]:
type(uber_15)
Out[454]:
pandas.core.frame.DataFrame
In [ ]:
 
In [455]:
uber_15.duplicated().sum()
Out[455]:
54
In [ ]:
 
In [456]:
print("The dataset contains 54 duplicated entries, which will be removed to ensure the integrity of the analysis. After removing duplicates, the dataset will have 99,946 rows.")
The dataset contains 54 duplicated entries, which will be removed to ensure the integrity of the analysis. After removing duplicates, the dataset will have 99,946 rows.
In [ ]:
 
In [457]:
uber_15.drop_duplicates(inplace=True)
In [458]:
uber_15.duplicated().sum()
Out[458]:
0
In [459]:
uber_15.shape
Out[459]:
(99946, 4)
In [ ]:
 
In [460]:
uber_15.dtypes
Out[460]:
Dispatching_base_num    object
Pickup_date             object
Affiliated_base_num     object
locationID               int64
dtype: object
In [ ]:
 
In [461]:
print("The dataset columns have a mix of data types, including object for categorical variables and int64 for numerical variables. The Pickup_date was originally of type object but was later converted to datetime for time-series analysis.")
The dataset columns have a mix of data types, including object for categorical variables and int64 for numerical variables. The Pickup_date was originally of type object but was later converted to datetime for time-series analysis.
In [ ]:
 
In [462]:
uber_15.isnull().sum()
Out[462]:
Dispatching_base_num       0
Pickup_date                0
Affiliated_base_num     1116
locationID                 0
dtype: int64
In [463]:
print("There are 1,116 missing values in the Affiliated_base_num column, which represents the affiliated base for each Uber ride. All other columns have no missing data, indicating completeness in other dimensions of the dataset.")
There are 1,116 missing values in the Affiliated_base_num column, which represents the affiliated base for each Uber ride. All other columns have no missing data, indicating completeness in other dimensions of the dataset.
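Rather than dropping the 1,116 rows with a missing Affiliated_base_num, the gaps can be made explicit so those rides still count in time-based aggregations. A minimal sketch on a toy frame (the values are illustrative, not from the dataset):

```python
import numpy as np
import pandas as pd

# Toy frame mimicking the schema; values are made up for illustration.
rides = pd.DataFrame({
    "Dispatching_base_num": ["B02617", "B02682", "B02617", "B02764"],
    "Affiliated_base_num": ["B02764", np.nan, np.nan, "B02764"],
})

# Keep the rows but label the gaps instead of silently discarding records.
rides["Affiliated_base_num"] = rides["Affiliated_base_num"].fillna("Unknown")
```

Whether to fill or drop depends on the question being asked; for ride-volume analysis by hour or month, the affiliated base is not needed, so filling is the safer choice.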
In [ ]:
 
In [464]:
uber_15['Pickup_date'][0]
Out[464]:
'2015-05-02 21:43:00'
In [465]:
#Pickup_date is currently stored as a string; it is converted to datetime below.
In [466]:
type(uber_15['Pickup_date'][0])
Out[466]:
str
In [467]:
uber_15['Pickup_date']= pd.to_datetime(uber_15['Pickup_date'])
In [ ]:
 
In [468]:
print("The Pickup_date column was successfully converted from a string to a datetime format, allowing for further time-based analysis, such as extracting the hour, weekday, and month.")
The Pickup_date column was successfully converted from a string to a datetime format, allowing for further time-based analysis, such as extracting the hour, weekday, and month.
In [ ]:
 
In [469]:
uber_15['Pickup_date'].dtype
Out[469]:
dtype('<M8[ns]')
In [470]:
uber_15['Pickup_date'][0]
Out[470]:
Timestamp('2015-05-02 21:43:00')
In [471]:
type(uber_15['Pickup_date'][0])
Out[471]:
pandas._libs.tslibs.timestamps.Timestamp
In [472]:
uber_15.dtypes
Out[472]:
Dispatching_base_num            object
Pickup_date             datetime64[ns]
Affiliated_base_num             object
locationID                       int64
dtype: object
In [ ]:
 
In [473]:
uber_15['month']=uber_15['Pickup_date'].dt.month_name()
In [474]:
uber_15['month']
Out[474]:
0            May
1        January
2          March
3          April
4          March
          ...   
99995      April
99996      March
99997      March
99998        May
99999       June
Name: month, Length: 99946, dtype: object
In [475]:
uber_15['month'].value_counts().plot(kind='bar')
Out[475]:
<Axes: xlabel='month'>
[bar chart: Uber ride counts per month]
In [476]:
print("The bar chart illustrates the distribution of Uber rides across different months. This plot can help identify any seasonal trends or variations in Uber ride demand throughout the year, showing peaks and dips in ride counts per month.")
The bar chart illustrates the distribution of Uber rides across different months. This plot can help identify any seasonal trends or variations in Uber ride demand throughout the year, showing peaks and dips in ride counts per month.
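One caveat: value_counts() orders the bars by frequency, not by calendar order, which makes seasonal trends harder to read. Reindexing by month name before plotting gives a chronological x-axis. A sketch with toy counts (the numbers are illustrative, not the real totals):

```python
import pandas as pd

# Toy month counts; value_counts() would sort these by frequency,
# so reindex against calendar order before plotting.
counts = pd.Series({"May": 20660, "June": 19620, "February": 15896})
calendar = ["January", "February", "March", "April", "May", "June",
            "July", "August", "September", "October", "November", "December"]
ordered = counts.reindex(calendar).dropna().astype(int)
# ordered.plot(kind='bar') would now show months left-to-right in calendar order.
```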
In [ ]:
 
In [477]:
uber_15['weekday'] = uber_15['Pickup_date'].dt.day_name()
uber_15['day'] = uber_15['Pickup_date'].dt.day
uber_15['hour'] = uber_15['Pickup_date'].dt.hour
uber_15['minute'] = uber_15['Pickup_date'].dt.minute
In [478]:
uber_15.head(4)
Out[478]:
Dispatching_base_num Pickup_date Affiliated_base_num locationID month weekday day hour minute
0 B02617 2015-05-02 21:43:00 B02764 237 May Saturday 2 21 43
1 B02682 2015-01-20 19:52:59 B02682 231 January Tuesday 20 19 52
2 B02617 2015-03-19 20:26:00 B02617 161 March Thursday 19 20 26
3 B02764 2015-04-10 17:38:00 B02764 107 April Friday 10 17 38
In [ ]:
 
In [479]:
pivot1 = pd.crosstab(index=uber_15['month'], columns=uber_15['weekday'])
In [480]:
pivot1
Out[480]:
weekday Friday Monday Saturday Sunday Thursday Tuesday Wednesday
month
April 2365 1833 2508 2052 2823 1880 2521
February 2655 1970 2550 2183 2396 2129 2013
January 2508 1353 2745 1651 2378 1444 1740
June 2793 2848 3037 2485 2767 3187 2503
March 2465 2115 2522 2379 2093 2388 2007
May 3262 1865 3519 2944 2627 2115 2328
In [481]:
pivot1.plot(kind='bar')
Out[481]:
<Axes: xlabel='month'>
[grouped bar chart: ride counts by month, one bar per weekday]
In [ ]:
 
In [482]:
summary= uber_15.groupby(['weekday','hour'], as_index=False).size()
In [483]:
summary
Out[483]:
weekday hour size
0 Friday 0 581
1 Friday 1 333
2 Friday 2 197
3 Friday 3 138
4 Friday 4 161
... ... ... ...
163 Wednesday 19 1044
164 Wednesday 20 897
165 Wednesday 21 949
166 Wednesday 22 900
167 Wednesday 23 669

168 rows × 3 columns

In [484]:
plt.figure(figsize=(7,5))
sns.pointplot(x="hour", y="size", hue="weekday", data=summary)
Out[484]:
<Axes: xlabel='hour', ylabel='size'>
[point plot: ride counts by hour, one line per weekday]
In [ ]:
 
In [485]:
print("This point plot shows the number of rides at each hour of the day, broken down by weekday. Key takeaways from this visualization include:\n\nPeak Hours for Uber Rides:\n\nThere is a clear surge in the number of rides during early morning hours (around 7-9 AM) and evening hours (around 5-8 PM). These peaks correspond to common commuting times on weekdays, reflecting a pattern where people are likely using Uber to travel to and from work.\n\nOn weekends, the peak hours shift later in the day, particularly into the late afternoon and evening, suggesting that Uber is used more for leisure or social events during these times.\n\nWeekday vs. Weekend Patterns:\n\nWeekdays (Monday to Friday) exhibit more consistent demand, with pronounced spikes during commuting hours. This suggests that Uber is predominantly used for work-related commuting on those days.\n\nOn weekends (Saturday and Sunday), the demand pattern changes: the peak occurs later in the day, and ride counts build more gradually, likely because people use Uber for leisure activities or social events.\n\nOff-Peak Hours:\n\nLate-night and early-morning hours (between midnight and 6 AM) show consistently lower demand, regardless of the day of the week. However, weekends do show a slight increase in late-night demand, possibly due to late-night outings or events.\n\nConclusion:\n\nThis analysis of ride counts by hour and weekday highlights how Uber rides are distributed across different hours and days. The distinct peaks during commuting times on weekdays and later peaks on weekends provide actionable insights into user behavior, useful for understanding demand patterns and optimizing resource allocation.")
This point plot shows the number of rides at each hour of the day, broken down by weekday. Key takeaways from this visualization include:

Peak Hours for Uber Rides:

There is a clear surge in the number of rides during early morning hours (around 7-9 AM) and evening hours (around 5-8 PM). These peaks correspond to common commuting times on weekdays, reflecting a pattern where people are likely using Uber to travel to and from work.

On weekends, the peak hours shift later in the day, particularly into the late afternoon and evening, suggesting that Uber is used more for leisure or social events during these times.

Weekday vs. Weekend Patterns:

Weekdays (Monday to Friday) exhibit more consistent demand, with pronounced spikes during commuting hours. This suggests that Uber is predominantly used for work-related commuting on those days.

On weekends (Saturday and Sunday), the demand pattern changes: the peak occurs later in the day, and ride counts build more gradually, likely because people use Uber for leisure activities or social events.

Off-Peak Hours:

Late-night and early-morning hours (between midnight and 6 AM) show consistently lower demand, regardless of the day of the week. However, weekends do show a slight increase in late-night demand, possibly due to late-night outings or events.

Conclusion:

This analysis of ride counts by hour and weekday highlights how Uber rides are distributed across different hours and days. The distinct peaks during commuting times on weekdays and later peaks on weekends provide actionable insights into user behavior, useful for understanding demand patterns and optimizing resource allocation.
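The weekday/weekend split described above can be quantified directly from the summary table with a boolean mask. A sketch on a toy slice of the weekday×hour summary (the sizes are illustrative):

```python
import pandas as pd

# Toy slice of the weekday x hour summary; a mask separates the two regimes.
summary = pd.DataFrame({
    "weekday": ["Friday", "Monday", "Saturday", "Sunday"],
    "hour": [8, 8, 22, 22],
    "size": [581, 520, 910, 760],
})
is_weekend = summary["weekday"].isin(["Saturday", "Sunday"])
weekend_rides = summary.loc[is_weekend, "size"].sum()
weekday_rides = summary.loc[~is_weekend, "size"].sum()
```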
In [ ]:
 
In [486]:
os.listdir(r"/Users/mani/Documents/Data analytics projects/Datasets/")
Out[486]:
['other-Lyft_B02510.csv',
 'other-FHV-services_jan-aug-2015.csv',
 'other-Firstclass_B01536.csv',
 'other-Skyline_B00111.csv',
 'uber-raw-data-janjune-15_sample.csv',
 'uber-raw-data-janjune-15.csv',
 'other-American_B01362.csv',
 'uber-raw-data-apr14.csv',
 'Uber-Jan-Feb-FOIL.csv',
 'other-Highclass_B01717.csv',
 'uber-raw-data-aug14.csv',
 'uber-raw-data-sep14.csv',
 'uber-raw-data-jul14.csv',
 'other-Federal_02216.csv',
 'uber-raw-data-jun14.csv',
 'other-Carmel_B00256.csv',
 'other-Diplo_B01196.csv',
 'other-Dial7_B00887.csv',
 'uber-raw-data-may14.csv',
 'other-Prestige_B01338.csv']
In [ ]:
 
In [487]:
uber_foil=pd.read_csv(r"/Users/mani/Documents/Data analytics projects/Datasets/Uber-Jan-Feb-FOIL.csv")
In [488]:
uber_foil.shape
Out[488]:
(354, 4)
In [489]:
uber_foil.head(3)
Out[489]:
dispatching_base_number date active_vehicles trips
0 B02512 1/1/2015 190 1132
1 B02765 1/1/2015 225 1765
2 B02764 1/1/2015 3427 29421
In [490]:
!pip install chart_studio
!pip install plotly
In [491]:
import chart_studio.plotly as py
import plotly.graph_objs as go
import plotly.express as px

from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
In [492]:
init_notebook_mode(connected=True)
In [493]:
uber_foil.columns
Out[493]:
Index(['dispatching_base_number', 'date', 'active_vehicles', 'trips'], dtype='object')
In [494]:
px.box(data_frame=uber_foil, x='dispatching_base_number', y='active_vehicles')
In [ ]:
 
In [495]:
print("Observation:\n\nThe box plot visualizes the distribution of active vehicles for each dispatching base, providing insights into the variability and central tendencies of the data across different bases. Key takeaways from this plot include:\n\nVariation in Vehicle Activity Across Bases:\n\nSome dispatching bases have a wider range of active vehicles, indicating greater variability in the number of cars deployed from those bases on any given day.\n\nConversely, some bases may have a more consistent number of active vehicles, as reflected by smaller interquartile ranges (IQR), which indicates a more stable or uniform operational scale.\n\nOutliers and Extreme Values:\n\nThe presence of outliers in certain bases shows that on some occasions, the number of active vehicles might have been significantly higher or lower than usual. These outliers could represent special events, unusual demand surges, or operational anomalies.\n\nIdentifying such outliers can be useful for exploring cases where operational capacity was exceeded or when fewer vehicles were deployed than expected.\n\nComparing Dispatch Bases:\n\nThe comparison across multiple dispatching bases reveals which hubs consistently manage a higher or lower number of active vehicles. Some bases might have a higher median number of active vehicles, suggesting that they are responsible for a larger share of the operations.\n\nThe distribution patterns across bases can help Uber optimize fleet management and ensure that the right number of vehicles are dispatched based on historical data.\n\nOperational Efficiency:\n\nBases with tightly clustered data points (small IQR) likely run more predictably and efficiently, while bases with wider distributions may face operational challenges, such as fluctuating demand or driver availability.\n\nConclusion:\n\nThis box plot helps highlight the distribution and variability in the number of active vehicles across different dispatching bases. By identifying outliers and understanding the spread of data, Uber can fine-tune its resource allocation, address any anomalies, and ensure smoother operations at each base.")
Observation:

The box plot visualizes the distribution of active vehicles for each dispatching base, providing insights into the variability and central tendencies of the data across different bases. Key takeaways from this plot include:

Variation in Vehicle Activity Across Bases:

Some dispatching bases have a wider range of active vehicles, indicating greater variability in the number of cars deployed from those bases on any given day.

Conversely, some bases may have a more consistent number of active vehicles, as reflected by smaller interquartile ranges (IQR), which indicates a more stable or uniform operational scale.

Outliers and Extreme Values:

The presence of outliers in certain bases shows that on some occasions, the number of active vehicles might have been significantly higher or lower than usual. These outliers could represent special events, unusual demand surges, or operational anomalies.

Identifying such outliers can be useful for exploring cases where operational capacity was exceeded or when fewer vehicles were deployed than expected.

Comparing Dispatch Bases:

The comparison across multiple dispatching bases reveals which hubs consistently manage a higher or lower number of active vehicles. Some bases might have a higher median number of active vehicles, suggesting that they are responsible for a larger share of the operations.

The distribution patterns across bases can help Uber optimize fleet management and ensure that the right number of vehicles are dispatched based on historical data.

Operational Efficiency:

Bases with tightly clustered data points (small IQR) likely run more predictably and efficiently, while bases with wider distributions may face operational challenges, such as fluctuating demand or driver availability.

Conclusion:

This box plot helps highlight the distribution and variability in the number of active vehicles across different dispatching bases. By identifying outliers and understanding the spread of data, Uber can fine-tune its resource allocation, address any anomalies, and ensure smoother operations at each base.
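The outliers the box plot draws as individual points follow the conventional 1.5×IQR rule, which can also be computed directly per base. A sketch with toy active-vehicle counts (illustrative numbers, hypothetical base codes):

```python
import pandas as pd

# Toy active-vehicle counts per base; the 1.5*IQR fence flags the
# same points a box plot would draw as outliers.
df = pd.DataFrame({
    "dispatching_base_number": ["B02512"] * 5 + ["B02764"] * 5,
    "active_vehicles": [10, 11, 11, 12, 50, 99, 100, 100, 101, 102],
})

def iqr_outliers(s):
    q1, q3 = s.quantile([0.25, 0.75])
    fence = 1.5 * (q3 - q1)
    return s[(s < q1 - fence) | (s > q3 + fence)]

outliers = df.groupby("dispatching_base_number")["active_vehicles"].apply(iqr_outliers)
```

Here only the 50-vehicle day at the first base falls outside its fence; the second base's counts are tightly clustered and produce no outliers.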
In [ ]:
 
In [496]:
px.violin(data_frame=uber_foil, x='dispatching_base_number', y='active_vehicles')
In [ ]:
 
In [497]:
print("Observation: A violin plot is generated to compare the distribution of active_vehicles across different dispatching_base_number categories. The plot combines aspects of a box plot and a density plot, helping to visualize both the spread of the data and the probability density of vehicle counts across different dispatching bases.")
Observation: A violin plot is generated to compare the distribution of active_vehicles across different dispatching_base_number categories. The plot combines aspects of a box plot and a density plot, helping to visualize both the spread of the data and the probability density of vehicle counts across different dispatching bases.
In [ ]:
 
In [ ]:
 
In [498]:
os.listdir(r"/Users/mani/Documents/Data analytics projects/Datasets/")
Out[498]:
['other-Lyft_B02510.csv',
 'other-FHV-services_jan-aug-2015.csv',
 'other-Firstclass_B01536.csv',
 'other-Skyline_B00111.csv',
 'uber-raw-data-janjune-15_sample.csv',
 'uber-raw-data-janjune-15.csv',
 'other-American_B01362.csv',
 'uber-raw-data-apr14.csv',
 'Uber-Jan-Feb-FOIL.csv',
 'other-Highclass_B01717.csv',
 'uber-raw-data-aug14.csv',
 'uber-raw-data-sep14.csv',
 'uber-raw-data-jul14.csv',
 'other-Federal_02216.csv',
 'uber-raw-data-jun14.csv',
 'other-Carmel_B00256.csv',
 'other-Diplo_B01196.csv',
 'other-Dial7_B00887.csv',
 'uber-raw-data-may14.csv',
 'other-Prestige_B01338.csv']
In [499]:
files = os.listdir(r"/Users/mani/Documents/Data analytics projects/Datasets/")[7:19]
In [500]:
files
Out[500]:
['uber-raw-data-apr14.csv',
 'Uber-Jan-Feb-FOIL.csv',
 'other-Highclass_B01717.csv',
 'uber-raw-data-aug14.csv',
 'uber-raw-data-sep14.csv',
 'uber-raw-data-jul14.csv',
 'other-Federal_02216.csv',
 'uber-raw-data-jun14.csv',
 'other-Carmel_B00256.csv',
 'other-Diplo_B01196.csv',
 'other-Dial7_B00887.csv',
 'uber-raw-data-may14.csv']
In [501]:
for name in ['uber-raw-data-apr14.csv', 'Uber-Jan-Feb-FOIL.csv',
             'other-Highclass_B01717.csv', 'other-Federal_02216.csv',
             'other-Carmel_B00256.csv', 'other-Diplo_B01196.csv',
             'uber-raw-data-may14.csv']:
    files.remove(name)
In [508]:
files
Out[508]:
['uber-raw-data-aug14.csv',
 'uber-raw-data-sep14.csv',
 'uber-raw-data-jul14.csv',
 'uber-raw-data-jun14.csv',
 'other-Dial7_B00887.csv']
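Slicing os.listdir output and hand-removing names is fragile, because listdir ordering is filesystem-dependent. Note also that the resulting list still includes other-Dial7_B00887.csv, whose columns differ from the Uber monthly files, which is why the concatenated frame below has 10 columns and NaN-heavy rows. A filename-pattern filter avoids both problems; a sketch using names copied from the listing above:

```python
# Filenames copied from the directory listing; a startswith/endswith filter
# keeps only the 2014 Uber monthly files and drops other-vendor files
# such as other-Dial7_B00887.csv, which has a different schema.
names = [
    "uber-raw-data-aug14.csv", "uber-raw-data-sep14.csv",
    "uber-raw-data-jul14.csv", "other-Federal_02216.csv",
    "uber-raw-data-jun14.csv", "other-Dial7_B00887.csv",
]
uber_files = sorted(n for n in names
                    if n.startswith("uber-raw-data") and n.endswith("14.csv"))
```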
In [509]:
final = pd.DataFrame()
path = r"/Users/mani/Documents/Data analytics projects/Datasets/"

for file in files:
    current_df = pd.read_csv(os.path.join(path, file))
    final = pd.concat([current_df, final])
In [510]:
final.shape
Out[510]:
(3512368, 10)
In [511]:
final.duplicated().sum()
Out[511]:
67003
In [512]:
final.drop_duplicates(inplace=True)
In [513]:
final.shape
Out[513]:
(3445365, 10)
In [514]:
final.head(3)
Out[514]:
Date Time State PuFrom Address Street Date/Time Lat Lon Base
0 2014.07.06 14:30 NY ... MANHATTAN 50 MURRAY ST NaN NaN NaN NaN
1 2014.07.04 7:15 NY ... MANHATTAN 143 AVENUE B NaN NaN NaN NaN
2 2014.07.05 5:45 NY ... MANHATTAN 125 CHRISTOPHER ST NaN NaN NaN NaN
In [ ]:
 
In [515]:
rush_uber = final.groupby(['Lat','Lon'], as_index=False).size()
In [516]:
rush_uber.head(3)
Out[516]:
Lat Lon size
0 39.6569 -74.2258 1
1 39.6686 -74.1607 1
2 39.7214 -74.2446 1
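At raw GPS precision almost every (Lat, Lon) pair is unique, which is why the sizes above are mostly 1. Rounding the coordinates before grouping pools nearby pickups so the heat-map weights reflect local density. A sketch on toy coordinates (illustrative points, not from the dataset):

```python
import pandas as pd

# Toy pickups; rounding to 3 decimals (~110 m) merges near-identical points,
# so each group's size becomes a meaningful density weight for the heat map.
pickups = pd.DataFrame({"Lat": [40.7213, 40.7214, 40.7306],
                        "Lon": [-73.9981, -73.9982, -74.0012]})
rush = (pickups.round({"Lat": 3, "Lon": 3})
               .groupby(["Lat", "Lon"], as_index=False).size())
```

The first two pickups collapse into one cell with weight 2, while the third stays on its own.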
In [ ]:
 
In [517]:
!pip install folium
In [518]:
import folium
In [519]:
basemap = folium.Map()
In [520]:
basemap
Out[520]:
[interactive folium base map]
In [ ]:
 
In [521]:
from folium.plugins import HeatMap
In [522]:
HeatMap(rush_uber).add_to(basemap)
Out[522]:
<folium.plugins.heat_map.HeatMap at 0x312a23b60>
In [523]:
basemap
Out[523]:
[interactive folium map with a pickup-density heat layer]
In [ ]:
 
In [524]:
print("Observation from HeatMap(rush_uber).add_to(basemap):\n\nHeat Map on a Geographic Map: this overlays a heat layer on the folium map. rush_uber holds pickup counts grouped by latitude and longitude, so the colors reflect pickup density.\n\nHot zones (redder colors) mark the locations with the most Uber pickups, such as city centers or high-demand neighborhoods.\n\nCooler zones (bluer colors) indicate areas with fewer pickups or less demand.")
Observation from HeatMap(rush_uber).add_to(basemap):

Heat Map on a Geographic Map: this overlays a heat layer on the folium map. rush_uber holds pickup counts grouped by latitude and longitude, so the colors reflect pickup density.

Hot zones (redder colors) mark the locations with the most Uber pickups, such as city centers or high-demand neighborhoods.

Cooler zones (bluer colors) indicate areas with fewer pickups or less demand.
In [ ]:
 
In [ ]:
 
In [525]:
final.columns
Out[525]:
Index(['Date', 'Time', 'State', 'PuFrom', 'Address', 'Street', 'Date/Time',
       'Lat', 'Lon', 'Base'],
      dtype='object')
In [526]:
final.head(2)
Out[526]:
Date Time State PuFrom Address Street Date/Time Lat Lon Base
0 2014.07.06 14:30 NY ... MANHATTAN 50 MURRAY ST NaN NaN NaN NaN
1 2014.07.04 7:15 NY ... MANHATTAN 143 AVENUE B NaN NaN NaN NaN
In [527]:
final.dtypes
Out[527]:
Date          object
Time          object
State         object
PuFrom        object
Address       object
Street        object
Date/Time     object
Lat          float64
Lon          float64
Base          object
dtype: object
In [528]:
final['Date/Time'][0]
Out[528]:
0                 NaN
0    6/1/2014 0:00:00
0    7/1/2014 0:03:00
0    9/1/2014 0:01:00
0    8/1/2014 0:03:00
Name: Date/Time, dtype: object
In [529]:
final['Date/Time'] = pd.to_datetime(final['Date/Time'],format="%m/%d/%Y %H:%M:%S")
In [ ]:
 
In [530]:
print("Observation: The Date/Time column is being converted into a datetime format. This step is crucial for extracting specific date-time features (such as day or hour) and for ensuring consistency when analyzing time-based data.")
Observation: The Date/Time column is being converted into a datetime format. This step is crucial for extracting specific date-time features (such as day or hour) and for ensuring consistency when analyzing time-based data.
In [ ]:
 
In [531]:
final['Date/Time'].dtype
Out[531]:
dtype('<M8[ns]')
In [532]:
final['day'] = final['Date/Time'].dt.day
final['hour'] = final['Date/Time'].dt.hour
In [ ]:
 
In [533]:
print("Observation: New columns day and hour are extracted from the Date/Time column. This is often done for time-series analysis, where splitting data into components like day, hour, month, etc., helps analyze trends and patterns in specific time windows.")
Observation: New columns day and hour are extracted from the Date/Time column. This is often done for time-series analysis, where splitting data into components like day, hour, month, etc., helps analyze trends and patterns in specific time windows.
In [ ]:
 
In [534]:
final.head(3)
Out[534]:
Date Time State PuFrom Address Street Date/Time Lat Lon Base day hour
0 2014.07.06 14:30 NY ... MANHATTAN 50 MURRAY ST NaT NaN NaN NaN NaN NaN
1 2014.07.04 7:15 NY ... MANHATTAN 143 AVENUE B NaT NaN NaN NaN NaN NaN
2 2014.07.05 5:45 NY ... MANHATTAN 125 CHRISTOPHER ST NaT NaN NaN NaN NaN NaN
In [ ]:
 
In [535]:
print("Observation: The first few rows of the DataFrame final are displayed. This step is typically used for validation, ensuring the new columns (day, hour, etc.) were correctly added and the data is in the expected format.")
Observation: The first few rows of the DataFrame final are displayed. This step is typically used for validation, ensuring the new columns (day, hour, etc.) were correctly added and the data is in the expected format.
In [ ]:
 
In [536]:
pivot = final.groupby(['day','hour']).size().unstack()
In [537]:
pivot
Out[537]:
hour 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 ... 14.0 15.0 16.0 17.0 18.0 19.0 20.0 21.0 22.0 23.0
day
1.0 2702 1705 1107 1050 1034 1370 2139 3140 3296 3241 ... 5162 5621 5928 6446 5804 5438 5206 5098 4371 2975
2.0 1922 1292 892 1159 1467 2122 3346 4481 4492 3418 ... 4898 5872 7127 7675 7785 6797 6202 5785 4721 3354
3.0 2332 1520 1022 1106 1154 1846 3272 4346 4126 3572 ... 5347 6173 7186 7398 7411 6670 6056 5097 3961 2532
4.0 1436 877 670 962 1223 2028 3606 4638 4423 3660 ... 5153 6019 6611 7162 6995 6306 6188 6049 5347 3518
5.0 1882 1063 766 909 1207 2256 3796 4988 5185 4456 ... 5291 6051 6907 7696 7396 6953 6960 7060 6354 4446
6.0 2875 1815 1276 1141 1152 1588 2648 3486 3682 3333 ... 5542 6442 7069 7421 7222 6607 6331 6800 6727 5245
7.0 3348 2127 1381 1312 1298 1705 2398 3390 3583 3370 ... 5263 5963 6854 6996 6812 6084 6081 6036 5569 4044
8.0 2379 1402 939 1207 1427 2218 3288 4608 4653 3818 ... 5238 6141 6798 7204 6807 6199 5863 5761 4911 3271
9.0 2118 1351 977 1148 1432 2421 4091 5338 5544 4511 ... 5649 6482 7035 8101 7872 7009 6640 6361 5295 3629
10.0 2348 1564 1084 1046 1227 2042 3616 4791 4897 3889 ... 5510 6682 7608 8288 7608 6982 7134 6487 5101 3085
11.0 1599 931 654 995 1468 2040 3984 5592 5583 4255 ... 5791 6686 7606 8268 8003 7214 6950 6860 5949 3992
12.0 2460 1618 1068 1048 1340 2113 3590 5007 5052 4099 ... 6018 7056 7975 8942 9511 8292 8014 7939 7742 6131
13.0 4101 2724 1859 1481 1384 1835 3158 4344 4934 4528 ... 6467 7149 8161 9399 9495 8134 6893 6693 6105 5233
14.0 3398 2176 1480 1406 1347 1861 2675 3633 3774 3626 ... 5028 5823 6967 7431 6975 6940 6941 6595 5587 3700
15.0 2063 1269 917 1129 1463 2236 3473 4612 4710 3756 ... 5407 5935 6891 8046 8057 7594 6974 5749 4751 3175
16.0 1925 1260 893 1071 1390 2211 3923 5393 5532 4477 ... 5152 6277 7387 7865 7137 6730 6631 6291 5189 3600
17.0 2173 1529 1109 1124 1416 2158 3537 4935 4565 3741 ... 5512 6526 7298 8000 7755 6913 7036 6641 5234 3125
18.0 1560 942 630 1034 1459 2499 4099 5715 5347 4206 ... 5699 6914 7866 8511 8175 7647 7518 7412 6197 4162
19.0 2448 1670 1245 1299 1350 2029 3671 4876 4764 3959 ... 5595 6738 7477 8237 8197 7704 8168 8081 7404 5490
20.0 3762 2671 1757 1504 1255 1646 2623 3614 3850 3613 ... 5696 6534 7152 7660 7126 6590 6597 6849 6507 5225
21.0 3975 2996 1816 1424 1431 1838 2596 3645 3790 3467 ... 4960 5667 6588 6661 6684 6074 5971 6189 5953 4464
22.0 2389 1414 962 1146 1455 2329 3490 4588 4772 3852 ... 5178 6256 7166 7069 6499 6258 5984 5792 4780 3354
23.0 1965 1244 913 1104 1452 2334 3640 4776 4581 3818 ... 5356 6402 7036 7557 7539 7326 7410 6987 5750 3645
24.0 2317 1558 1092 1119 1392 2158 3446 4643 4535 3977 ... 5365 6199 7061 7756 7128 6668 6739 6241 5135 3027
25.0 1599 963 702 1070 1602 2493 4436 6405 5922 4950 ... 5650 6612 7552 8084 8087 6871 6770 6902 5719 4137
26.0 2468 1702 1148 1233 1363 2008 3370 4636 4462 3989 ... 5660 6715 7436 8098 8303 7639 7668 7989 7136 5626
27.0 3708 2684 1829 1613 1288 1721 2508 3632 3879 3730 ... 5675 6665 7328 7428 7341 6673 6372 7135 6661 5393
28.0 3759 2460 1689 1515 1419 1941 2662 3497 3848 3892 ... 5388 6057 6596 6651 6417 5980 5875 5787 5030 3679
29.0 2289 1606 1264 1450 1555 2307 3465 4511 4716 4098 ... 5107 5943 6344 6751 6558 5941 5863 5639 4587 3092
30.0 1796 1186 892 1079 1372 2255 3800 4990 5043 4021 ... 5253 6270 7164 7659 7362 6815 6827 6302 5080 2613
31.0 1415 934 759 662 629 828 1307 1808 1916 1970 ... 2853 3531 3853 3977 3899 3695 3630 3622 3392 2372

31 rows × 24 columns
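The pivot also makes peak hours easy to read off programmatically: idxmax along the columns returns each day's busiest hour. A sketch using just the first two days and two hours (0 and 17) taken from the table above:

```python
import pandas as pd

# Two days x two hours copied from the pivot above; idxmax(axis=1)
# returns the hour with the highest ride count for each day.
pivot = pd.DataFrame({0: [2702, 1922], 17: [6446, 7675]},
                     index=pd.Index([1, 2], name="day"))
pivot.columns.name = "hour"
peak_hour = pivot.idxmax(axis=1)
```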

In [538]:
pivot.style.background_gradient()
Out[538]:
[Styled 31 × 24 pivot table: rows = day (1-31), columns = hour (0-23), values = ride counts, shaded with a background color gradient. The highest counts cluster around hours 17-18.]
In [ ]:
 
In [539]:
print("Observation: This code generates a pivot table where the rows represent days, the columns represent hours, and the values reflect the number of Uber rides at each combination of day and hour. The background gradient is added for better visualization, making it easier to spot peaks and valleys in ride volume across days and hours.")
Observation: This code generates a pivot table where the rows represent days, the columns represent hours, and the values reflect the number of Uber rides at each combination of day and hour. The background gradient is added for better visualization, making it easier to spot peaks and valleys in ride volume across days and hours.
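The pivot itself was built earlier in the notebook; as a minimal, self-contained sketch (synthetic data, since the `final` DataFrame is not reproduced here), the same day × hour count table comes from `groupby` + `size` + `unstack`:

```python
import pandas as pd

# Synthetic stand-in for the `final` DataFrame used in the notebook
rides = pd.DataFrame({
    "day":  [1, 1, 1, 2, 2, 2, 2],
    "hour": [8, 8, 17, 8, 17, 17, 17],
})

# Count rides per (day, hour) pair and pivot hour into columns
pivot = rides.groupby(["day", "hour"]).size().unstack(fill_value=0)
print(pivot)
```

Calling `pivot.style.background_gradient()` then shades the table as shown above.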
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [540]:
def gen_pivot_table(df, col1, col2):
    # Use the df argument (the original body referenced the global `final`,
    # which defeats the purpose of the df parameter)
    pivot = df.groupby([col1, col2]).size().unstack()
    return pivot.style.background_gradient()
In [ ]:
 
In [541]:
print("Observation: This helper function automates the process of creating and styling pivot tables by grouping data based on two specified columns (col1 and col2). This allows the user to generate pivot tables dynamically for different columns and datasets.")
Observation: This helper function automates the process of creating and styling pivot tables by grouping data based on two specified columns (col1 and col2). This allows the user to generate pivot tables dynamically for different columns and datasets.
In [ ]:
 
In [542]:
final.columns
Out[542]:
Index(['Date', 'Time', 'State', 'PuFrom', 'Address', 'Street', 'Date/Time',
       'Lat', 'Lon', 'Base', 'day', 'hour'],
      dtype='object')
In [ ]:
 
In [ ]:
 
In [543]:
print("Observation: This line lists all column names of the DataFrame final. It helps verify the structure of the dataset after feature engineering or loading the data.")
Observation: This line lists all column names of the DataFrame final. It helps verify the structure of the dataset after feature engineering or loading the data.
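For reference, the `day` and `hour` columns were presumably derived from `Date/Time` along these lines (a hedged sketch on synthetic timestamps; the exact parsing call is not shown in this section, and the month/day/year format here is an assumption based on the raw 2014 files):

```python
import pandas as pd

# Synthetic Date/Time strings in the assumed month/day/year format
final_demo = pd.DataFrame({
    "Date/Time": ["4/1/2014 0:11:00", "4/1/2014 17:30:00", "4/15/2014 8:05:00"],
})

# Parse once, then extract day-of-month and hour-of-day
final_demo["Date/Time"] = pd.to_datetime(final_demo["Date/Time"],
                                         format="%m/%d/%Y %H:%M:%S")
final_demo["day"] = final_demo["Date/Time"].dt.day
final_demo["hour"] = final_demo["Date/Time"].dt.hour
print(final_demo[["day", "hour"]])
```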
In [ ]:
 
In [544]:
gen_pivot_table(final , "day", "hour")
Out[544]:
[Same styled 31 × 24 pivot table as Out[538] above: day rows, hour columns, ride counts shaded with a background gradient.]
In [ ]:
 
In [545]:
print("Observations:\n\nPeak Hours:\n\nThe frequency of rides is likely concentrated during certain hours of the day (e.g., mornings and evenings), which are typical rush hours. These peaks can indicate when Uber services are most in demand.\n\nOff-Peak Hours:\n\nThere are likely lower counts of rides during late-night hours and early mornings, especially between midnight and 5 AM. This reflects the expected decrease in demand when fewer people are traveling.\n\nDay-to-Day Variations:\n\nSome days might have significantly more rides compared to others. For instance, weekends or Fridays might have higher ride counts due to social events, while weekdays may show a steady but lower pattern of usage.\n\nConsistent Trends:\n\nThe heat map (from the background gradient) allows us to easily spot consistent high-traffic periods across multiple days, such as weekday morning and evening commutes, highlighting the repetitive nature of travel patterns.\n\nUnique Outliers:\n\nCertain hours of a particular day might show an unexpected spike or drop in rides, possibly due to events or holidays, where demand deviates from the regular pattern.\n\n\n\nThis analysis provides a clear visual representation of the busiest hours and days for Uber rides, helping to understand patterns of demand.")
Observations:

Peak Hours:

The frequency of rides is likely concentrated during certain hours of the day (e.g., mornings and evenings), which are typical rush hours. These peaks can indicate when Uber services are most in demand.

Off-Peak Hours:

There are likely lower counts of rides during late-night hours and early mornings, especially between midnight and 5 AM. This reflects the expected decrease in demand when fewer people are traveling.

Day-to-Day Variations:

Some days might have significantly more rides compared to others. For instance, weekends or Fridays might have higher ride counts due to social events, while weekdays may show a steady but lower pattern of usage.

Consistent Trends:

The heat map (from the background gradient) allows us to easily spot consistent high-traffic periods across multiple days, such as weekday morning and evening commutes, highlighting the repetitive nature of travel patterns.

Unique Outliers:

Certain hours of a particular day might show an unexpected spike or drop in rides, possibly due to events or holidays, where demand deviates from the regular pattern.



This analysis provides a clear visual representation of the busiest hours and days for Uber rides, helping to understand patterns of demand.
In [ ]:
 
In [546]:
"""
The final result of the entire project is an in-depth analysis of Uber ride patterns. The key takeaways:

1. Temporal Analysis
Peak-Hour Insights: Extracting day and hour from the Date/Time column enables analysis of peak ride hours. The pivot tables and heat maps reveal when Uber rides are in highest demand (e.g., morning and evening rush hours).
Day-to-Day Variations: Ride frequency varies across days, indicating that weekends or specific days experience higher demand.

2. Geographic Analysis (Using Heat Maps)
Heat Map of Uber Rides: Using HeatMap(rush_uber).add_to(basemap), a heat map visualizes the geographic distribution of Uber rides during peak hours. This reveals areas of high and low ride activity and identifies hotspots where Uber services are most in demand (e.g., city centers or popular venues).

3. Base Number Analysis
Dispatching Base Numbers: Violin plots and heat maps show how activity at each dispatching base varies by time of day and day of the week, revealing operational patterns and identifying which bases handle the most rides during peak periods.

4. Visualizations
Violin Plot: Compares the distribution of active vehicles across dispatching base numbers.
Pivot Tables and Heat Maps: Show in detail how ride volume fluctuates over time and space.

5. Overall Insights
Operational Efficiency: Visualizing ride density and patterns lets Uber optimize fleet management, ensuring more vehicles are available in high-demand areas during peak hours.
Geographic Hotspots: The heat map helps Uber focus resources on the areas with the most demand, reducing driver idle time and increasing ride availability in key zones.
Temporal Demand: The day- and hour-based analysis identifies consistent demand patterns, enabling Uber to predict when and where rides will be needed most.

Conclusion:
The project provides valuable insights into both the temporal and geographic distribution of Uber rides, enabling more informed decisions for fleet management, driver dispatching, and customer service improvement.
"""
In [ ]: